 blonde hair


Semantic Token Reweighting for Interpretable and Controllable Text Embeddings in CLIP

arXiv.org Artificial Intelligence

The text encoder in Vision-Language Models (VLMs) such as CLIP plays a crucial role in translating textual input into an embedding space shared with images, thereby facilitating the interpretative analysis of vision tasks through natural language. Although the significance of different textual elements within a sentence varies with context, efforts to account for this variation in importance when constructing text embeddings have been lacking. We propose a framework of Semantic Token Reweighting to build Interpretable text embeddings (SToRI), which incorporates controllability as well. SToRI refines the text encoding process in CLIP by differentially weighting semantic elements based on contextual importance, enabling finer control over emphasis in response to data-driven insights and user preferences. The efficacy of SToRI is demonstrated through comprehensive experiments on few-shot image classification and image retrieval tailored to user preferences.
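
The abstract does not spell out how the reweighting is computed, so the following is only a minimal sketch of one plausible way to weight tokens inside an attention step, not the SToRI implementation. It assumes each token carries a non-negative importance weight that scales its unnormalized attention mass before normalization. The function name `reweighted_attention`, the example tokens, and the weights are all hypothetical.

```python
import math


def reweighted_attention(scores, weights):
    """Softmax over attention logits with a per-token importance weight.

    scores:  raw attention logits from one query to each token
    weights: non-negative importance per token (1.0 leaves a token unchanged)

    Assumption: a weight multiplies the token's unnormalized attention mass,
    which is the same as adding log(weight) to its logit.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    peak = max(scores)  # subtract the max logit for numerical stability
    mass = [w * math.exp(s - peak) for s, w in zip(scores, weights)]
    total = sum(mass)
    if total == 0:
        raise ValueError("at least one token needs a positive weight")
    return [m / total for m in mass]


# Hypothetical prompt tokens with equal logits; "red" is emphasized 3x.
tokens = ["a", "bird", "with", "red", "wings"]
probs = reweighted_attention([0.0] * 5, [1.0, 1.0, 1.0, 3.0, 1.0])
for token, p in zip(tokens, probs):
    print(f"{token:>6}: {p:.3f}")
```

With all weights set to 1.0 the function reduces to an ordinary softmax, so the weights act purely as a user-facing or learned control on emphasis.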


LADDER: Language Driven Slice Discovery and Error Rectification

arXiv.org Artificial Intelligence

Error slice discovery associates structured patterns with model errors. Existing methods discover error slices by clustering error-prone samples with similar patterns or by assigning discrete attributes to each sample for post-hoc analysis. While these methods aim for interpretability and easier mitigation through reweighting or rebalancing, they may not capture the full complexity of error patterns because attributes are incomplete or missing. In contrast to existing approaches, this paper uses the reasoning capabilities of a Large Language Model (LLM) to analyze complex error patterns and generate testable hypotheses. This paper proposes LADDER: Language Driven slice Discovery and Error Rectification. It first projects the model's representation into a language-aligned feature space (e.g., CLIP) to preserve semantics in the original model feature space. This ensures the accurate retrieval of sentences that highlight the model's errors. Next, the LLM uses the sentences to generate hypotheses that identify error slices. Finally, we mitigate the errors by fine-tuning the classification head on a group-balanced dataset created from the hypotheses. Our entire method does not require any attribute annotation, either explicitly or through external tagging models. We validate our method on five image classification datasets. The code is available (https://github.com/batmanlab/Ladder).
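
The final mitigation step mentions a group-balanced dataset. As a rough illustration only, and not the code from the linked repository, the sketch below assumes each sample already has a class label and a boolean flag saying whether a hypothesized attribute applies to it. It then upsamples every (label, attribute) group to the same size. The function name `group_balanced_sample` and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def group_balanced_sample(labels, attributes, per_group=None, seed=0):
    """Return sample indices in which every (label, attribute) group
    appears equally often.

    Assumption: `attributes` holds one boolean per sample indicating whether
    a hypothesized attribute is present; how that flag is derived is left out.
    """
    groups = defaultdict(list)
    for idx, (label, attr) in enumerate(zip(labels, attributes)):
        groups[(label, attr)].append(idx)
    if not groups:
        return []
    if per_group is None:
        # Default: match the size of the largest group.
        per_group = max(len(members) for members in groups.values())
    rng = random.Random(seed)
    balanced = []
    for key in sorted(groups):
        # Sampling with replacement upsamples the small groups.
        balanced.extend(rng.choices(groups[key], k=per_group))
    rng.shuffle(balanced)
    return balanced


# Toy data: class 1 rarely co-occurs with the attribute, class 0 usually does.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
attributes = [True, True, True, False, True, False, False, False, False, False]
indices = group_balanced_sample(labels, attributes)
print(len(indices))
```

The returned indices could then be used to draw the training batches for the classification head, so that no single (label, attribute) combination dominates the fine-tuning signal.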


When Love and the Algorithm Don't Mix

TIME - Tech

When I met my husband, who happens to be white, he told me that he was always seeing women with blonde hair on Tinder and he's not really into blondes. No matter how many times he had swiped left on blondes, the algorithms were always recommending them to him, presumably because pop culture dictates that white men prefer blondes. Luckily for us, the algorithms' tendency to stack blonde women in his swipe deck worked out in our favor because I'm a black woman who, at the time, had blonde hair. In nearly 10 years of swiping through profiles on Tinder, Bumble, Hinge, and OkCupid, I learned that dating apps can provide pathways for finding friendship, adventure, romance, and sometimes, love. But there was one aspect of dating app culture that I couldn't ignore because it was often the first thing matches wanted to talk about: race.